The combinatorics of tandem duplication

Authors

  • Luca Penso Dolfin
  • T. Wu
  • C. D. Greenman
Abstract

Tandem duplication is an evolutionary process whereby a segment of DNA is replicated and proximally inserted. The different configurations that can arise from this process raise some interesting combinatorial questions. Firstly, we introduce an algebraic formalism to represent this process as a word-producing automaton. The number of words arising from n tandem duplications can then be derived recursively. Secondly, each single word accounts for multiple evolutions. With the aid of a bi-coloured 2d-tree, a Hasse diagram corresponding to a partially ordered set is constructed, from which we can count the number of evolutions corresponding to a given word. Thirdly, we implement some subtree prune and graft operations on this structure to show that the total number of possible evolutions arising from n tandem duplications is $\prod_{k=1}^{n} \left(4^{k} - (2k + 1)\right)$. The space of structures arising from tandem duplication thus grows at a super-exponential rate with leading order term $O\left(4^{\frac{1}{2}n^{2}}\right)$.
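The closed-form count quoted in the abstract can be checked numerically. The sketch below is an illustration only, not the authors' code: it assumes the product is read as the product over k = 1..n of 4^k − (2k + 1), which is the reading consistent with the stated leading-order term. The function names `evolution_count` and `log4_ratio` are hypothetical helpers introduced here.

```python
from math import log


def evolution_count(n: int) -> int:
    """Product over k = 1..n of (4**k - (2*k + 1)).

    Assumed reading of the abstract's formula for the total number of
    evolutions arising from n tandem duplications.
    """
    total = 1
    for k in range(1, n + 1):
        total *= 4**k - (2 * k + 1)
    return total


def log4_ratio(n: int) -> float:
    """Return log_4(evolution_count(n)) / n**2.

    If the leading-order term is 4**(n**2 / 2), this ratio should
    approach 1/2 as n grows.
    """
    return log(evolution_count(n), 4) / n**2


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, evolution_count(n))
    for n in (10, 50, 200):
        print(n, round(log4_ratio(n), 4))
```

Under this reading the first few factors are 1, 11, 57 and 247, so the counts for n = 1..4 are 1, 11, 627 and 154869. Python integers have arbitrary precision, so the product is exact; only the logarithm in `log4_ratio` is floating point. The ratio drifts towards 1/2 slowly because the exponent also carries a linear term of about n/2.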

Similar resources

The combinatorics of tandem duplication trees.

We developed a recurrence relation that counts the number of tandem duplication trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. The number of rooted duplication trees is exactly twice the number of unrooted trees, which means that on average only two ...


The Combinatorics of Tandem Duplication Trees

We develop a recurrence relation that counts the number of Tandem Duplication Trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. We find that the number of rooted duplication trees is exactly twice the number of unrooted trees, which means, on average, o...


Gene Family: Structure, Organization and Evolution

Gene families are groups of homologous genes that share very similar sequences and may have identical functions. Members of gene families may be found in tandem repeats or interspersed throughout the genome. These sequences are copies of ancestral genes that have undergone changes. The multiple copies of each gene in a family were constructed based on gene duplicati...


Generalized Neighbor Joining Approaches for Reconstructing Tandem Duplication History: a comparative study

Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: We study the generalized neighbor-joining approaches for reconstructing tandem duplication history. We develop a ...


Neighbor Joining Approaches for Reconstructing Tandem Duplication History

Motivation: Genomes are replete with short sequences repeated consecutively called tandem repeats. Reconstructing duplication histories for tandem repeats may yield valuable insights into their functions and the biological mechanisms of tandem repeat creation and extension. Results: we design and implement a set of heuristic algorithms for reconstructing tandem duplication history with neighbor...



Journal:
  • Discrete Applied Mathematics

Volume 194

Publication date: 2015